INTRODUCTION


Tuberous sclerosis complex (TSC) is an autosomal dominant disease characterized by benign tumors in
various tissues. The genes mutated in this disease, TSC1 and TSC2, encode tumor suppressors that are
associated in a complex. The TSC1/2 complex, through its Rheb-GAP activity, is a critical negative regulator
of mTORC1 under physiological conditions. Activation of mTORC1 stimulates cap-dependent mRNA
translation via its downstream substrates S6K and 4E-BP. In our previous study, we demonstrated that
TSC-mTORC1 signaling regulates the balance between cap-dependent and cap-independent translation. In this
project, we aim to elucidate cis-regulatory elements and trans-acting factors in TSC-mTOR pathway-mediated
translational regulation.
BODY


Task  1.  Define  TSC-mediated  translational  regulation  using  genome-wide  ribosome  profiling 

1a. Elucidate the ribosome dynamics in response to the TSC-mTOR pathway

• Determine the pattern of ribosome pausing in TSC2 KO cells. A reliable measure of the translation
of a cellular mRNA is the degree of its association with ribosomes. Actively translated mRNAs are typically
bound by several ribosomes (polysomes) and can be separated from individual 40S and 60S ribosomal subunits
and 80S monosomes by centrifugation through a linear sucrose gradient. Ribosome profiling, based on deep
sequencing of ribosome-protected mRNA fragments (RPFs), has proven to be powerful in defining ribosome
positions on the entire transcriptome [1, 2]. To investigate the ribosome dynamics during mRNA translation,
we have established a modified ribosome profiling technique (Ribo-seq) adapted from the previously published
protocol (Fig. 1A). Direct comparison of RPFs from both monosome and polysome fractions afforded us an
opportunity to dissect the transition between initiation and elongation.


We built a heat map of RPF density over the entire transcriptome for both TSC2 WT and TSC2 KO
cells (Fig. 1B). In addition to the clear enrichment of RPFs at the start of transcripts [3], a significant portion of
footprints were also located at several downstream codons, with another prominent peak of reads positioned at
the +4 codon (Fig. 1C). These results strongly suggest the existence of post-initiation pausing after the
assembly of the 80S ribosome at the initiator codon. Interestingly, TSC2 KO cells showed a substantial decrease
of RPF density around the initiation codon as compared to the wild type (Fig. 1B, 1C). In particular, the
second pause peak at the +4 codon was nearly abolished in cells lacking TSC2 (Fig. 1C). Ribosome pausing
has long been attributed to the presence of specific sequence features, such as rare codons or RNA
secondary structures [4-6]. Our results suggest that ribosome pausing could also be subject to regulation
by nutrient signaling. It appears that mTORC1 not only promotes ribosome loading via a cap-dependent
mechanism, but also reduces the initiation pausing of 80S ribosomes, resulting in a faster transition between
initiation and elongation.
Fig. 1. Visualization of ribosome dynamics at codon resolution. (A) Schematic for genome-wide ribosome
footprinting technology. (B) Heatmap of RPF density over the entire transcriptome of TSC2 WT and TSC2 KO
cells. For clarity, only the first 100 codons and the last 100 codons are shown on a genome-wide scale. The
blue color scale indicates the RPF density, white areas indicate no RPF reads, and yellow indicates regions
with no sequence. (C) Normalized RPF density over the entire length of the whole transcriptome from monosome
(top panel) and polysome (bottom panel) fractions.


• Determine the role of TOP sequence in ribosome pausing. By calculating the pausing index
across the entire transcriptome (Fig. 1), we found that a subset of transcripts had an exceptionally high
initiation pause. Of interest, we observed a strong propensity for initiation pausing on transcripts encoding
ribosomal proteins (RP) (data not shown). A common feature of RP transcripts is the presence of a
characteristic 5’UTR: an uninterrupted sequence of 6-12 pyrimidines at the 5’ end called the terminal
oligopyrimidine (TOP) sequence [7, 8]. Importantly, the TOP motif is necessary for growth-associated
translational regulation of RP mRNAs [8]. However, the underlying molecular mechanism remains
controversial. To examine whether the presence of the TOP sequence is sufficient to confer initiation
pausing on unrelated transcripts, we cloned the TOP sequence of Rpl15 into the 5’ end of a
reporter gene encoding Luc. We also included the Gapdh 5’UTR as a negative control. After ribosome
profiling of transfected cells, we compared the distribution of ribosome density on each Luc transgene bearing
a different 5’UTR and found a positive correlation between the presence of the TOP sequence and initiation pausing.
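
The pausing index is not defined in this report. As a minimal sketch, assume it is the mean RPF density in a short window at the start of the coding sequence divided by the mean density over the rest of the coding sequence. The 10-codon window, the function name, and the read counts below are illustrative assumptions, not the pipeline used in this work.

```python
from typing import Sequence


def pausing_index(codon_counts: Sequence[float], window: int = 10) -> float:
    """Assumed definition: mean RPF density in the first `window` codons
    divided by the mean density over the remaining codons."""
    if window <= 0 or len(codon_counts) <= window:
        raise ValueError("transcript must be longer than the initiation window")
    head = codon_counts[:window]
    body = codon_counts[window:]
    body_mean = sum(body) / len(body)
    if body_mean == 0:
        # No reads in the body: the ratio is undefined, report inf or 0.
        return float("inf") if sum(head) > 0 else 0.0
    return (sum(head) / len(head)) / body_mean


# Hypothetical per-codon read counts for a TOP-bearing and a control reporter.
top_reporter = [40, 30, 20, 10, 10, 10, 10, 10, 10, 10] + [2] * 90
control_reporter = [4, 3, 2, 2, 2, 2, 2, 2, 2, 2] + [2] * 90
```

Under this assumed definition, a transcript with uniform ribosome density scores close to 1, and higher values indicate stronger accumulation of ribosomes near the start codon.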

1b.  Determine  the  selection  of  translation  initiation  sites  in  response  to  TSC-mTOR  pathway 

• Establish the global mapping of translation initiation sites. A recent study used the initiation-specific
translation inhibitor harringtonine to deplete elongating ribosomes from mRNAs, thereby halting
ribosomes at the initiation codons [9]. This approach uncovered an unexpected abundance of alternative TIS
codons, in particular non-AUG codons in the 5’UTR. However, these data were “noisy” and required a
machine-learning algorithm to identify TIS codons. We
developed global translation initiation sequencing (GTI-seq)
by utilizing two related but distinct translation inhibitors to
effectively differentiate ribosome initiation from elongation.

While  cycloheximide  (CHX)  freezes  all  translating 
ribosomes,  the  translation  inhibitor  lactimidomycin  (LTM) 
preferentially  acts  on  the  initiating  ribosome  but  not  the 
elongating  ribosome  (Fig.  2).  LTM  bears  several 
advantages  over  harringtonine  in  achieving  the  high 
resolution  mapping  of  global  TIS  positions.  First,  LTM 
binds  to  the  80S  ribosome  already  assembled  at  the 
initiation  codon  and  permits  the  first  peptide  bond 
formation  [10].  Thus,  the  LTM-associated  RPF  more  likely 
represents  physiological  TIS  positions.  Second,  LTM 
occupies  the  empty  E-site  of  initiating  ribosomes  and  thus 
completely  blocks  the  translocation.  This  feature  allows 
the  TIS  identification  at  single  nucleotide  resolution. 

Third, owing to the similar structure and the same
binding site in the ribosome [10], LTM and CHX can be
applied side-by-side to achieve simultaneous
assessment of both initiation and elongation for the
same transcript. With its high signal/noise ratio, GTI-seq
offers a direct TIS identification approach with
minimal computational aid.


Fig.  2.  Experimental  Strategy  of  GTI-seq  Using  Ribosome 
E-site  Translation  Inhibitors 

Pretreatment of HEK293 cells with either 100 µM CHX or 50
µM LTM resulted in different patterns of RPFs as revealed by
metagene  analysis.  CHX-associated  RPFs  are  mainly 
located  in  the  body  of  coding  region.  Remarkably,  LTM- 
associated  RPFs  are  enriched  at  the  annotated  start  codon. 
• Characterize alternative translation. From 4,000 transcripts with detectable TIS peaks, we identified a
total of 16,231 TIS sites. Codon composition analysis revealed that more than half of the TIS codons used
AUG  as  the  translation  initiator  (Fig.  3A).  GTI-seq  also 
identified  a  significant  proportion  of  TIS  codons  employing 
near-cognate  codons  that  differ  from  AUG  by  a  single 
nucleotide,  in  particular  CUG  (16%).  Remarkably,  nearly  half 
of  the  transcripts  (42%)  contained  multiple  TIS  sites  (Fig.  3B), 
suggesting  that  alternative  translation  prevails  even  under 
physiological  conditions.  In  addition  to  validating  initiation  at 
the  annotated  start  codon  (aTIS),  GTI-seq  revealed  39%  of 
the  transcripts  containing  downstream  initiation  sites  (dTIS) 
and  54%  of  transcripts  bearing  upstream  TIS  positions  (uTIS). 

While  dTIS  codons  use  the  conventional  AUG  as  the  main 
initiator,  a  significant  fraction  of  uTIS  codons  are  non-AUG 
with  the  CUG  as  the  most  frequent  one.  We  experimentally 
validated  different  translational  products  initiated  from 
alternative  start  codons,  including  non-AUG  (data  not  shown). 
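
The codon and position categories used above can be made explicit with a short sketch. It classifies a TIS codon as AUG, near-cognate (exactly one nucleotide different from AUG), or other, and labels a TIS position relative to the annotated start codon. The function names and the use of transcript coordinates are assumptions for illustration; this is not the GTI-seq analysis code.

```python
from collections import Counter


def is_near_cognate(codon: str) -> bool:
    """True if `codon` differs from AUG at exactly one position."""
    codon = codon.upper().replace("T", "U")
    return len(codon) == 3 and sum(a != b for a, b in zip(codon, "AUG")) == 1


def classify_tis(position: int, annotated_start: int) -> str:
    """Label a TIS as upstream (uTIS), annotated (aTIS) or downstream (dTIS),
    using transcript coordinates of the first nucleotide of the codon."""
    if position < annotated_start:
        return "uTIS"
    if position > annotated_start:
        return "dTIS"
    return "aTIS"


def codon_composition(codons):
    """Fraction of TIS codons that are AUG, near-cognate, or other."""
    def kind(codon):
        codon = codon.upper().replace("T", "U")
        if codon == "AUG":
            return "AUG"
        return "near-cognate" if is_near_cognate(codon) else "other"

    counts = Counter(kind(c) for c in codons)
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()}
```

For example, `is_near_cognate("CUG")` is true because CUG differs from AUG only at the first position, whereas a codon such as CCG differs at two positions and falls into the "other" class.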
Fig. 3. Global TIS identification by GTI-seq. (A) Codon composition of all TIS codons identified by
GTI-seq. (B) Histogram showing the overall distribution of TIS number identified on each
transcript.
Task 2. Define the impact of the TSC-mTOR pathway on protein quality and quantity control


2a. Determine how the TSC-mTOR pathway affects the quality of newly synthesized proteins


To investigate how nutrient signaling affects the folding of nascent chains, we used firefly luciferase (Luc) as a
reporter because of its high chaperone dependency and stress sensitivity [11, 12]. Cells lacking tuberous
sclerosis complex 2 (TSC2) exhibit constitutively active mTORC1 signaling [13, 14]. In spite of the increased
protein synthesis, we observed a much lower Luc activity in TSC2 KO cells than in wild type cells (Fig. 4A).
Proteasome inhibition by MG132 treatment caused a significant accumulation of Luc (Fig. 4B), indicating
reduced stability of Luc in cells with hyperactive mTORC1 signaling. Remarkably, reducing mTORC1 signaling
with the specific inhibitor rapamycin was able to rescue the steady-state levels of Luc (Fig. 4B). The reduced
protein quality in TSC2 KO cells was not limited to Luc. A significant amount of polyubiquitinated species
also accumulated in these cells after proteasome inhibition as compared to the wild type cells (Fig. 4C). These
results strongly suggest that hyperactive mTORC1 signaling increases the yield of protein synthesis at the
expense of folding quality.


Fig 4. Hyperactive mTORC1 signaling reduces the quality of newly synthesized proteins. (A) TSC2 WT and TSC2 KO
cells were transfected with either Luc plasmid or mRNA. Real time Luc activity was recorded immediately after transfection.
(B) Luc-transfected TSC2 WT and TSC2 KO cells were treated with MG132 or DMSO followed by immunoblotting analysis
using antibodies indicated. (C) Whole cell lysates of TSC2 WT and TSC2 KO cells were immunoblotted to detect
ubiquitin-conjugated species.


2b.  Determine  how  TSC-mTOR  pathway  affects  translation  fidelity 


Translation fidelity can be thought of as a competition between the cognate and near-cognate tRNAs for a given
codon. It is conceivable that the increased translation speed under hyperactive mTORC1 signaling generates
more erroneous proteins via misreading of codons. We analyzed two different aspects of translation fidelity:
nonsense suppression and aa-tRNA selection. To evaluate nonsense suppression we constructed a pGL3-Luc
vector with a stop codon in the coding region of Luc (Fig. 5). To evaluate aa-tRNA selection we made
a vector in which the AGA codon of Luc at aa218 was
mutated to the near-cognate AGC codon. The R218 residue
is a critical amino acid for Luc activity and this mutation
potently reduces the luciferase activity. Measuring the Luc
activity level of the R218S mutant allows us to evaluate the rate
of amino acid misincorporation at this position [15]. As
expected, both Fluc(Stop) and Fluc(R218S) mutants
showed less than 1% of the enzymatic activity of the wild type
Fluc. Despite such low expression levels, TSC2 KO cells
showed an increase in Fluc activity for both Fluc(Stop) and
Fluc(R218S) when compared to the wild type cells (2 fold
and 1.5 fold increase, respectively) (Fig. 5). These results
suggest that the ribosomes in cells with hyperactive
mTORC1 signaling have a higher rate of altered aa-tRNA
selection.
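
The comparison above rests on two simple ratios: mutant activity as a fraction of wild type Fluc within each cell line, and the KO value divided by the WT value. The sketch below shows that arithmetic. All readings are hypothetical numbers chosen only to illustrate a 2-fold difference, and the function names are assumptions rather than part of the reported analysis.

```python
from math import sqrt
from statistics import mean, stdev


def relative_activity(mutant, wild_type):
    """Mutant activity as a fraction of wild type Fluc activity, computed
    per replicate. Returns (mean, SEM) across replicates."""
    if len(mutant) != len(wild_type) or not mutant:
        raise ValueError("need matched, non-empty replicate lists")
    ratios = [m / w for m, w in zip(mutant, wild_type)]
    sem = stdev(ratios) / sqrt(len(ratios)) if len(ratios) > 1 else 0.0
    return mean(ratios), sem


def fold_change(ko_relative: float, wt_relative: float) -> float:
    """Fold difference of normalized mutant activity, KO over WT cells."""
    return ko_relative / wt_relative


# Hypothetical luminescence readings (arbitrary units), n = 2 per condition.
wt_cells_rel, wt_cells_sem = relative_activity([40, 60], [10000, 10000])
ko_cells_rel, ko_cells_sem = relative_activity([90, 110], [10000, 10000])
ko_over_wt = fold_change(ko_cells_rel, wt_cells_rel)
```

Normalizing to wild type Fluc within each cell line separates read-through or misincorporation from the overall difference in Fluc expression between TSC2 WT and KO cells.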


Panels: Fluc(Stop), Fluc(R218S)

Fig 5. TSC2 WT (left panel) and KO cells (right panel)
were transfected with plasmids encoding Fluc mutants as
in (C). After 24 hr of transfection, cells were treated with
20 nM rapamycin for 15 hr followed by measurement of
Fluc activities. Relative Fluc activities were normalized
using wild type Fluc (mean ± SEM; n=2).
KEY RESEARCH ACCOMPLISHMENTS


•  Genome-wide  ribosome  profiling  reveals  regulated  post-initiation  pausing 

•  Role  of  TOP  sequence  in  post-initiation  pausing 

• Global mapping of alternative translation initiation using GTI-seq

•  GTI-seq  reveals  prevailing  alternative  translation 

• TSC-mTORC1 increases the yield of protein synthesis at the expense of protein quality

• TSC-mTORC1 controls translational fidelity
CONCLUSIONS


The observations described in this study have several implications. First, the critical role of TSC in ribosome
dynamics and translation quality extends the molecular linkage between mTORC1 and protein homeostasis.
Second, we find distinct roles for mTORC1 downstream targets in maintaining protein homeostasis. Loss of
S6 kinases, but not 4E-BP family proteins, diminishes the effects of rapamycin on the quality of translational
products. Third, the finding that an increase in protein synthesis is accompanied by a decrease in protein
quality provides a plausible mechanism for how persistent mTORC1 signaling favors the development of
age-related pathologies. With the most common feature of aging being an accumulation of misfolded proteins
derived from erroneous biosynthesis and post-synthetic modification, protein homeostasis is an important
mediator of the effect of rapamycin on longevity.

Together, this study expands our knowledge about the role of TSC-mTOR signaling in translational control of gene
expression. The distinct roles for mTORC1 downstream targets in maintaining protein homeostasis reveal a
mechanistic connection between mTORC1 and protein homeostasis.
How the ribosome-bound nascent chain folds to assume its
functional tertiary structure remains a central puzzle in biology.
In contrast to refolding of a denatured protein, cotranslational
folding is complicated by the vectorial nature of nascent chains,
the frequent ribosome pausing, and the cellular crowdedness.
Here, we present a strategy called folding-associated cotranslational
sequencing that enables monitoring of the folding competency
of nascent chains during elongation at codon resolution. By
using an engineered multidomain fusion protein, we demonstrate
an efficient cotranslational folding immediately after the emergence
of the full domain sequence. We also apply folding-associated
cotranslational sequencing to track cotranslational
folding of hemagglutinin in influenza A virus-infected cells. In
contrast to sequential formation of distinct epitopes, the receptor
binding domain of hemagglutinin follows a global folding route
by displaying two epitopes simultaneously when the full sequence
is available. Our results provide direct evidence of domain-wise
global folding that occurs cotranslationally in mammalian cells.

deep  sequencing  |  ribosome  profiling  |  protein  quality 

It is currently believed that protein folding generally begins
during translation on the ribosome (1, 2). In mammalian cells,
the rate of protein synthesis is approximately five residues per
second, whereas folding typically occurs on the microsecond
scale (3, 4). Thus, many details of the cotranslational folding
pathway remain elusive. For example, what types of structures
and/or intermediates are formed in the nascent chain
during cotranslational folding? How early in translation are
these structures formed? In contrast to the in vitro refolding of
full-length polypeptides, the cotranslational folding of emerging
polypeptides is influenced by their sequential exposure from the
ribosome exit tunnel to the cytosol (5). Cotranslational folding is
further complicated by frequent ribosome pausing (6), as well as
interactions with cellular binding partners (7).

Traditional methods of detecting cotranslational folding rely
on monitoring of the enzymatic activity of model proteins
synthesized in vitro (1, 2). These assays are impractical when applied
to cells under physiological conditions. A few in vivo experiments
supporting cotranslational folding were based on pulse-chase
metabolic labeling coupled with folding-dependent cleavage
analysis (8). A limitation of this approach is low resolution.
Fluorescence-based techniques, such as FRET, allow detection
of cotranslational folding and interactions of nascent chains with
high resolution (9). However, FRET measurements require
incorporation of modified amino acids and are limited to cell-free
systems. None of these methods can be used for simultaneous
monitoring of cotranslational folding of nascent chains with
varied lengths in vivo. We developed an approach called
folding-associated cotranslational sequencing (FactSeq) that overcomes
many of these deficiencies. By harnessing the power of the
ribosome profiling technique (10), FactSeq allows us to dissect at
what point during translation the nascent chain acquires a
specific conformation.


Results  and  Discussion 

During translation elongation, the positions of ribosomes on
a given mRNA, and hence the length of the synthesized
polypeptide chain, can be determined by deep sequencing of the
ribosome-protected mRNA fragments (RPFs) (10, 11). FactSeq is
based on enriching ribosomes bearing nascent chains with
recognizable structural features, followed by sequencing of the
associated RPFs. Direct comparison of RPF distribution before
and after ribosome enrichment provides sequence-specific
structural information associated with the nascent chain. To pilot this
technique, we generated a HEK293 cell line stably expressing the
multidomain fusion protein Flag-FRB-GFP, in which the
well-characterized FKBP12-rapamycin binding domain (FRB) was
fused to the NH2 terminus of GFP (Fig. 1). After collecting the
ribosome fractions from the whole-cell lysates by using sucrose
gradient sedimentation, we converted polysomes to single
ribosomes by RNase I treatment to digest mRNAs not protected by
the ribosome. We first enriched ribosomes bearing the
NH2-terminal Flag-tagged nascent chain by immunoprecipitation (IP)
using anti-Flag mAb-coated beads. After extracting Flag
tag-associated RPFs as well as total RPFs from the same sample, we
constructed a cDNA library suitable for Illumina high-throughput
sequencing (Fig. 1).

The sequencing results of RPFs obtained with or without Flag IP
were of similar quality (Fig. S1). As expected, the majority of RPFs
were approximately 30 nt in length. The 5' end positions of RPFs
showed a strong 3-nt periodicity, confirming that the RPFs
accurately capture the ribosome movement along mRNAs. By using
RPFs derived from the entire transcriptome, we built a ribosome
density map on the Flag-FRB-GFP transcript (Fig. 2A).
Consistent with the nonuniform rates of translation elongation, the
transcript was punctuated with multiple ribosome pausing sites
with a skewed number of reads located at the start codon region.
Notably, the linker region between FRB and GFP showed the
fewest RPF reads, possibly because of our selection of commonly
used codons in creating the construct. Over four independent
RPF deep-sequencing replicates, the ribosome distribution
pattern on individual transcripts was highly reproducible (Fig. S2).
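
The 3-nt periodicity check can be illustrated with a short sketch that tallies the reading frame of each RPF 5' end relative to the start of the coding sequence. The input format (a list of 5' end coordinates on one transcript) and the function name are assumptions for illustration; the coordinates below are invented.

```python
from collections import Counter


def frame_fractions(five_prime_positions, cds_start):
    """Fraction of RPF 5' ends falling in each reading frame (0, 1, 2),
    measured relative to the first nucleotide of the start codon."""
    # Python's % returns a non-negative result, so 5' ends that lie
    # upstream of the start codon are still assigned to frames 0-2.
    counts = Counter((pos - cds_start) % 3 for pos in five_prime_positions)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads supplied")
    return [counts[frame] / total for frame in range(3)]


# Hypothetical 5' end coordinates on a transcript whose CDS starts at 100.
example_ends = [100, 103, 106, 109, 112, 115, 101, 104]
fractions = frame_fractions(example_ends, cds_start=100)
```

A strong periodicity shows up as one frame holding most of the 5' ends, whereas reads unrelated to translating ribosomes would be spread roughly evenly over the three frames.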

Fig. 1. Schematic for FactSeq approach. The polysomes from HEK293/Flag-FRB-GFP are converted into monosomes by RNase I treatment, followed by IP using
anti-Flag or recombinant FKBP in the presence or absence of rapalog. The RPFs are extracted and mixed with spike-in control before cDNA library
construction. The deep sequencing results of RPFs are analyzed by transcriptome mapping. (Inset) Circle depicts structure of FKBP (blue, PDB 1A7X) and FRB (red,
PDB 1AUE) dimerization in the presence of rapamycin (green).

As the NH2-terminal Flag tag is present at the start of the
nascent chain, the Flag mAb-associated RPFs should capture
nearly all the ribosome footprints during elongation. Alignment
of RPF reads on Flag-FRB-GFP transcript before and after Flag
IP revealed a nearly identical pattern of ribosome density except
the first 50 codon region (Fig. 2A, Lower). As ~40 aa of the
growing polypeptide chain are buried within the ribosome exit
tunnel, this corresponds well to the minimal length of 10 aa required
for antibody binding (i.e., the full length of Flag tag).
Notably,  there  was  little  reduction  of  Flag  IP-associated  RPF 
reads  relative  to  total  RPFs  along  the  remaining  part  of  the 
transcript,  indicating  that  the  Flag  tag  remains  intact  during  the 
synthesis  of  Flag-FRB-GFP.  This  argues  that  cotranslational 
degradation  is  minimal  in  this  multidomain  fusion  protein.  Thus, 
the  FactSeq  approach  allows  tracking  of  the  behavior  of  nascent 
chains  with  high  accuracy  and  sensitivity. 

We next extended the FactSeq approach to evaluate cotranslational
folding of the FRB domain before the complete synthesis
of GFP. To probe the folding status of the FRB domain, we
took advantage of its binding partner FKBP (FK506 binding
protein). The dimerization of FKBP and FRB relies on their 3D
structures and the presence of rapamycin or rapalog (12, 13)
(Fig. 1). As expected, recombinant FKBP synthesized in
Escherichia coli specifically pulled down the Flag-FRB-GFP fusion
protein in a rapalog-dependent manner (Fig. S3). Thus,
FKBP-rapalog can be used as a bait to probe the folding status of FRB
before the full-length fusion protein is released from the
ribosome. Consistent with the high specificity of rapalog-mediated
FRB-FKBP interaction, very few RPF reads were recovered in
the absence of rapalog (Fig. 2B). By contrast, adding rapalog
selectively restored a significant number of RPFs starting at
codon position 150 of the Flag-FRB-GFP transcript. Given 10 aa
from the Flag tag and 40 aa buried in the ribosome tunnel, the
appearance of RPF reads after codon position 150 corresponds
to the minimal length of 100 aa at which the FRB domain starts
to create the rapalog binding site and associate with FKBP.
Thus, the FRB domain is able to fold immediately after the
corresponding amino acid sequence emerges from the ribosome.
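
The codon arithmetic in this paragraph can be written out as a small sketch. It assumes, as stated in the text, that about 40 aa of the nascent chain are buried in the exit tunnel, so a residue becomes accessible once the ribosome has advanced 40 codons past it. The function name is hypothetical, and any linker residues between the Flag tag and FRB are ignored.

```python
def first_accessible_codon(last_residue: int, tunnel_length: int = 40) -> int:
    """Ribosome codon position at which `last_residue` of the nascent chain
    has just cleared the exit tunnel (tunnel_length is the ~40 aa assumed
    in the text)."""
    return last_residue + tunnel_length


flag_length = 10   # aa, full length of the Flag tag
frb_length = 100   # aa, minimal FRB length needed to form the rapalog binding site

flag_exposed_at = first_accessible_codon(flag_length)
frb_exposed_at = first_accessible_codon(flag_length + frb_length)
print(flag_exposed_at, frb_exposed_at)  # 50 150
```

The two values match the boundaries seen in the data: Flag IP recovers RPFs only beyond the first 50 codons, and FKBP-rapalog recovers RPFs from codon 150 onward.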
Fig. 2. Monitoring cotranslational behavior of Flag-FRB-GFP polypeptide in mammalian cells. (A) Comparison of RPF distribution on Flag-FRB-GFP transcript
before and after anti-Flag IP. Both the total and Flag IP-associated RPFs are aligned based on the sequence position of Flag-FRB-GFP. (Lower) Pattern analysis
using single codon peak ratio (dot plot) and significance (P value) of RPF density vs. background in a 10-codon sliding window (field plot). The line plot
represents the LOESS-smoothed trend line for single codon peak ratio (sampling proportion, 0.2). The colored areas represent regions of nascent chains that
are inaccessible to anti-Flag antibody. Cutoff line was set at P = 0.001 (green dashed line). (B) Comparison of RPF distribution on Flag-FRB-GFP transcript before
and after FKBP affinity purification. (Upper) Alignment of RPFs associated with total and FKBP binding in the presence or absence of rapalog. (Lower) Pattern
analysis using single codon peak ratio (dot plot) and significance (P value) of RPF density vs. background in a 10-codon sliding window (field plot). The line
plot represents the LOESS-smoothed trend line for single codon peak ratio (sampling proportion, 0.2). The colored areas represent regions of nascent chains
inaccessible to FKBP. Cutoff line was set at P = 0.001 (green dashed line).
Fig. 3. Monitoring cotranslational folding of HA in influenza A-infected cells. (A) Epitope sites on the RBD domain of HA. The structure of monomeric HA is
shown as a ball-and-stick model with HA1 in light brown and HA2 in light blue (PDB 1RU7). The Sa site recognized by Y8-10C2 is shown in red, whereas the Sb
site recognized by H28-E23 is dark blue. The RBD domain is identified from R53 to E276. (B) Comparison of RPF distribution on HA transcript derived from PR8 H1N1
influenza A before and after IP. Polysome fractions were prepared from influenza A-infected HeLa cells. IP was performed by using a panel of antibodies,
followed by deep sequencing of RPFs. (Lower) Pattern analysis after IP with Y8-10C2 (red) and H28-E23 (blue) using single codon peak ratio (dot plot) and
significance (P value) of RPF density vs. background in a 10-codon sliding window (field plot). The line plot represents the LOESS-smoothed trend line for single
codon peak ratio (sampling proportion, 0.2). The colored areas represent regions of nascent chains inaccessible to antibodies. Cutoff line was set at P = 0.001
(green dashed line). (C) Comparison of RPF distribution on HA transcript derived from CV1 mutant that escapes the Y8-10C2 recognition. (Lower) Pattern
analysis after IP with H28-E23 (blue) using single codon peak ratio (dot plot) and significance (P value) of RPF density vs. background in a 10-codon sliding
window (field plot). The line plot represents the LOESS-smoothed trend line for single codon peak ratio (sampling proportion, 0.2). Cutoff line was set at
P = 0.001 (green dashed line). (D) A model of domain-wise global folding of the HA nascent chain attached to the ribosome. Rapid cotranslational folding
occurs only after the full RBD domain sequence is available. The blue ovals represent possible binding partners of nascent chains, such as molecular chaperones.
Intriguingly, there was a significant reduction of FKBP-associated
RPF reads after codon position 200 of the Flag-FRB-GFP
transcript relative to total RPFs (P = 6.256 × 10⁻⁵; Fig. 2B,
Lower). We cannot attribute this to cotranslational degradation,
because the NH2-terminal Flag tag was continuously present
(Fig. 2A). Rather, the appearance of GFP polypeptide could
partially prevent folded FRB from interacting with FKBP,
possibly by associating with molecular chaperones or other binding
partners (5).

Having successfully monitored the cotranslational folding of
the FRB domain, we next applied FactSeq to evaluate the
cotranslational folding process of influenza A hemagglutinin (HA).
Influenza presents a serious public health challenge, and HA is
a prime candidate for vaccine design and drug development (14).
HA is a type I transmembrane glycoprotein with multiple folding
domains. The protein is cotranslationally glycosylated in the
endoplasmic reticulum. After posttranslational trimerization, the
HA trimer then traffics to the cell surface. The crystal structure
of HA reveals a globular ectodomain sitting atop an extended
stalk (Fig. 3A). The HA mediates attachment to cells via its
receptor binding site in the ectodomain [i.e., receptor-binding
domain (RBD)], which also contains most of the epitopes for
neutralizing antibody recognition (15). All known neutralizing
anti-HA mAbs recognize conformational determinants, i.e., their
epitopes are formed by folding of the primary sequences. For
instance, the Y8-10C2 mAb recognizes an epitope on the HA
comprising residues from E158 to Y168 [Puerto Rico/8/34
(PR8)], whereas the H28-E23 epitope is principally formed by
residues from N187 to E198 (16) (Fig. 3A). As these mAbs are
capable of binding HA monomers based on local folding of their
respective domains, we used them to probe cotranslational
formation of distinct epitopes during the biosynthesis of HA. As
a negative control, we used the H17-L2 mAb, whose binding is
dependent on trimerization, as its epitope is formed by residues
on each side of the trimer interface (17).

We purified ribosomes 5 h after viral infection of HeLa cells with PR8 (Fig. S4).
The ribosome fractions were then immunoprecipitated using each of the mAbs, followed
by deep sequencing of RPFs. Total RPF reads aligned to the full length of the HA
transcript exhibited typical pausing sites (Fig. 3B). Trimer-specific H17-L2 set the
background level of reads. In contrast, both Y8-10C2 and H28-E23 recovered a large
number of RPFs over background. Thus, FactSeq can be used to investigate the
cotranslational folding of endoplasmic reticulum proteins. Intriguingly, Y8-10C2
exhibited a delay of almost 110 codons in the appearance of RPFs relative to the
emergence of its epitope, which ends at residue Y168, from the ribosome exit tunnel
(Fig. 3B). H28-E23 also had a lag of approximately 70 codons after residue E198
emerged from the ribosome. Notably, both mAbs began to bind simultaneously, upon
emergence of the entire globular RBD, which ends at residue E276 (Fig. 3A). These
results indicate that the Y8-10C2 and H28-E23 epitopes are not formed sequentially.
Although we could not exclude the possibility of mAb binding-induced nascent chain
folding, our previous study suggests that such an event is unlikely to contribute to
the specific RPFs revealed by FactSeq (15). First, refolding of denatured HA by
antibody binding occurs over days, not within 1 h. Second, mAb binding-induced HA
folding, if rapid, would lead to continuous enrichment of mAb-associated RPFs along
with elongation. However, similar to cotranslational folding of the FRB domain, HA
exhibited reduced accessibility to Y8-10C2 after initial formation of its epitope.
The discontinuous antibody accessibility of the continuously elongating nascent
chain could represent a previously unrecognized feature of the cotranslational
folding pathway. Thus, the coappearance of distinct epitopes once the full
ectodomain is available is consistent with a model in which global folding occurs
rapidly once the RBD sequence has emerged from the ribosome (Fig. 3D).
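
The codon arithmetic behind this interpretation can be checked with a short sketch. It assumes that the ribosome exit tunnel covers about 30 codons (the background window used in Data Analysis) and uses the approximate lags quoted above; the function and constant names are illustrative and are not part of the published analysis.

```python
# Illustrative check of the codon lags reported for the two HA epitopes.
# TUNNEL_CODONS is an assumption borrowed from the 30-codon background window
# described in Data Analysis; the lags are the approximate values in the text.
TUNNEL_CODONS = 30


def ribosome_codon_at_binding(last_epitope_residue, lag_codons, tunnel=TUNNEL_CODONS):
    """Codon being decoded when mAb-associated RPFs first appear."""
    return last_epitope_residue + tunnel + lag_codons


# Ribosome position at which the whole RBD (ending at E276) has left the tunnel.
rbd_emerged = 276 + TUNNEL_CODONS
# Y8-10C2: epitope ends at Y168, lag of almost 110 codons.
y8_binding = ribosome_codon_at_binding(168, 110)
# H28-E23: epitope ends at E198, lag of approximately 70 codons.
h28_binding = ribosome_codon_at_binding(198, 70)

print(rbd_emerged, y8_binding, h28_binding)
```

Under these assumptions, both antibodies begin to bind within about 10 codons of the point at which the full RBD has cleared the tunnel, even though the two epitopes themselves emerge 30 codons apart.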

To validate the cotranslational folding propensity of HA revealed by FactSeq, we
performed [35S]methionine pulse-chase of influenza A virus-infected cells coupled
with IP. Pulse-chase was performed at 20 °C to slow down elongation and
trimerization of HA polypeptides. Y8-10C2 recovers HA fragments in addition to the
full-length polypeptide. In contrast, the HA trimer-specific antibody H17-L2
recovers only full-length HA. Y8-10C2-binding fragments completely resolve into
full-length HA during the chase, demonstrating a precursor-product relationship and
indicating that HA fragments can generate the Y8-10C2 conformational epitope
(Fig. S5A). Remarkably, the pattern of HA fragments matches well with that revealed
by FactSeq (Fig. S5B). For instance, the smallest fragment from each assay was
approximately 30 kDa. Despite its relatively low resolution, pulse-labeling analysis
confirms the discontinuous nature of mAb binding during elongation. These data
support the conclusion that the high-resolution pattern offered by FactSeq
represents the true behavior of nascent chain synthesis.

We next asked how mutation of an epitope affects the default folding pathway of the
HA globular domain. To this end, we chose the PR8 escape mutant CV1, whose single
K165E substitution reduces Y8-10C2 avidity more than 100-fold (18) (Fig. S6). Only
background levels of RPF reads were recovered by Y8-10C2 from CV1, providing an
important confirmation of the specificity of the FactSeq method (Fig. 3C).
Interestingly, whereas CV1 maintained a codon lag similar to that of WT HA in the
generation of the H28-E23 epitope, the full extent of epitope formation was delayed,
indicating that the single K165E substitution alters the kinetics of the
cotranslational folding of HA (Fig. 3 C and D).

Conclusion 

It is widely believed that cotranslational folding is a universal feature of newly
synthesized polypeptides (1, 2). However, monitoring this dynamic process is
challenging, particularly inside mammalian cells. By combining the power of the
ribosome profiling technique with folding-sensitive affinity reagents, FactSeq
provides a unique view of the folding competency of the nascent chain during its
elongation. The acquisition of functional FRB and the HA RBD immediately after their
sequences emerge from the ribosome exit tunnel strongly favors a domain-wise global
folding pathway. Although FactSeq cannot provide real-time kinetics of the folding
pathway, the snapshot it takes consists of continuous frames of ribosomes bearing
nascent chains of varied length. FactSeq also requires folding-sensitive affinity
reagents to capture the specific folding status of ribosome-attached nascent chains.
With the increasing number of available conformation-specific antibodies or binding
factors for various gene products (19), FactSeq is readily applicable to endogenous
proteins thanks to the high throughput of deep sequencing, which covers the entire
transcriptome. In addition, by taking advantage of intrabodies that are designed to
be expressed intracellularly (20, 21), FactSeq has the potential to capture
cotranslational folding in live cells before cell lysis. Beyond cotranslational
folding, the prototype of FactSeq can also be applied to other cotranslational
events. For instance, by using an NH2-terminal tag, the same concept can be adapted
to investigate cotranslational degradation by comparing the loss, rather than the
gain, of RPFs. Finally, FactSeq can also be expanded to study cotranslational
chaperone interaction with nascent chains (7, 22). The applicability of FactSeq is
not limited to studying cotranslational events. The basic principle can be used to
design in vivo folding reporters to investigate cellular factors that influence
cotranslational folding. We envision that this approach will provide novel insights
into protein triage decisions under physiological as well as pathological
conditions.

Materials  and  Methods 

Cells and Reagents. HEK293 cells stably expressing Flag-FRB-GFP were maintained in
DMEM with 10% (vol/vol) FBS. Rapalog AP21967 was provided by Ariad. Anti-Flag and
anti-HA antibodies were purchased from Sigma, and protein A/G beads from Santa Cruz.
TRIzol reagent was purchased from Invitrogen.

Influenza A Infection. HeLa cells were infected with the influenza A/PR8 strain at
a multiplicity of 20 pfu/cell in AIM medium, pH 6.6. After adsorption at 37 °C for
1 h, infected monolayers were overlaid with DMEM containing 7.5% FBS and incubated
for an additional 5 h.

Ribosome Profiling. Sucrose solutions were prepared in polysome gradient buffer
[10 mM Hepes, pH 7.4, 100 mM KCl, 5 mM MgCl2, 100 µg/mL cycloheximide, 5 mM DTT,
and 20 U/mL SUPERase·In (Ambion)]. Sucrose density gradients [15-45% (wt/vol)] were
freshly made in SW41 ultracentrifuge tubes (Fisher) using a BioComp Gradient Master
(BioComp) according to the manufacturer's instructions. HEK293/Flag-FRB-GFP cells
were plated in four 10-cm dishes before ribosome profiling. Cells were first treated
with cycloheximide (100 µg/mL) for 3 min at 37 °C to freeze the translating
ribosomes, followed by a wash with ice-cold PBS solution. Cells were then harvested
in ice-cold polysome lysis buffer [10 mM Hepes, pH 7.4, 100 mM KCl, 5 mM MgCl2,
100 µg/mL cycloheximide, 5 mM DTT, 20 U/mL SUPERase·In, and 2% (vol/vol) Triton
X-100]. After centrifugation at 4 °C and 10,000 × g for 10 min, approximately
650 µL of supernatant was loaded onto sucrose gradients, followed by centrifugation
for 100 min at 38,000 rpm, 4 °C, in an SW41 rotor. Separated samples were
fractionated at 0.375 mL/min by using a fractionation system (Isco) that continually
monitored OD254 values. Fractions were collected into tubes at 1-min intervals.

Ribosome Purification. To convert polysomes into monosomes, E. coli RNase I (Ambion)
was added to the pooled polysome samples (750 U per 100 A260 units) and incubated
at 4 °C for 1 h. Preclearance was conducted by incubating the ribosome samples with
30 µL protein A/G beads coated with 4% BSA for 1 h at room temperature. For IP using
mAbs, 30 µL protein A/G beads were first incubated with 5 µg mAbs for 1 h at room
temperature, followed by blocking with 4% BSA for 1 h. The mAb-coated beads were
then incubated with the precleared ribosome samples at 4 °C for 1 h, followed by
three washes with polysome lysis buffer. For the FKBP binding assay, 20 µg
recombinant HA-FKBP protein purified from E. coli (BL21) was first immobilized on
protein A/G beads using anti-HA antibody. After blocking with 4% BSA for 1 h, the
beads were incubated with the precleared ribosome samples at 4 °C for 1 h in the
absence or presence of 1 µM rapalog. After three washes with polysome lysis buffer,
total RNA was extracted by using TRIzol reagent.

cDNA Library Construction of Ribosome-Protected mRNA Fragments. Purified RNA samples
were first mixed with 1 nM of synthetic 28-nt random RNA
(5'-AUGUACACGGAGUCGACCCGCAACGCGA-3') as the spike-in control. The mixed RNA samples
were then dephosphorylated in a 15 µL reaction containing 1× T4 polynucleotide
kinase buffer, 10 U SUPERase·In, and 20 U T4 polynucleotide kinase (NEB).
Dephosphorylation was carried out for 1 h at 37 °C, and the enzyme was then
heat-inactivated for 20 min at 65 °C. Dephosphorylated samples were mixed with 2×
Novex TBE-Urea sample buffer (Invitrogen) and loaded on a Novex denaturing 15%
polyacrylamide TBE-urea gel (Invitrogen). The gel was stained with SYBR Gold
(Invitrogen) to visualize the RNA fragments. Gel bands containing RNA species
corresponding to 28 nt were excised and physically disrupted by centrifugation
through holes in the tube. RNA fragments were eluted by soaking overnight in gel
elution buffer (300 mM NaOAc, pH 5.5, 1 mM EDTA, 0.1 U/mL SUPERase·In). The gel
debris was removed using a Spin-X column (Corning), and RNA was purified by ethanol
precipitation.

Purified RNA fragments were resuspended in 10 mM Tris (pH 8) and denatured briefly
at 65 °C for 30 s. The poly-(A) tailing reaction was performed in 8 µL with 1×
poly-(A) polymerase buffer, 1 mM ATP, 0.75 U/µL SUPERase·In, and 3 U E. coli
poly-(A) polymerase (NEB). Tailing was carried out for 45 min at 37 °C. For reverse
transcription, the following oligos containing barcodes were synthesized:

MCA02, 5'-pCAGATCGTCGGACTGTAGAACTCTCAAGCAGAAGACGGCATACGATTTTTTTTTTTTTTTTTTTTVN-3';
LGT03, 5'-pGTGATCGTCGGACTGTAGAACTCTCAAGCAGAAGACGGCATACGATTTTTTTTTTTTTTTTTTTTVN-3';
YAG04, 5'-pAGGATCGTCGGACTGTAGAACTCTCAAGCAGAAGACGGCATACGATTTTTTTTTTTTTTTTTTTTVN-3';
HTC05, 5'-pTCGATCGTCGGACTGTAGAACTCTCAAGCAGAAGACGGCATACGATTTTTTTTTTTTTTTTTTTTVN-3'.

In brief, the tailed RNA product was mixed with 0.5 mM dNTP and 2.5 mM synthesized
primer and incubated at 65 °C for 5 min, followed by incubation on ice for 5 min.
The reaction mix was then supplemented with 20 mM Tris (pH 8.4), 50 mM KCl, 5 mM
MgCl2, 10 mM DTT, 40 U RNaseOUT, and 200 U SuperScript III (Invitrogen). The RT
reaction was performed according to the manufacturer's instructions. RNA was
eliminated from cDNA by adding 1.8 µL of 1 M NaOH and incubating at 98 °C for
20 min. The reaction was then neutralized with 1.8 µL of 1 M HCl. Reverse
transcription products were separated on a 10% polyacrylamide TBE-urea gel as
described earlier. The extended first-strand product band was expected to be
approximately 100 nt, and the corresponding region was excised. The cDNA was
recovered by using DNA gel elution buffer (300 mM NaCl, 1 mM EDTA).

First-strand cDNA was circularized in a 20 µL reaction containing 1× CircLigase
buffer, 2.5 mM MnCl2, 1 M betaine, and 100 U CircLigase II (Epicentre).
Circularization was performed at 60 °C for 1 h, and the reaction was
heat-inactivated at 80 °C for 10 min. Circular single-strand DNA was relinearized
with 20 mM Tris-acetate, 50 mM potassium acetate, 10 mM magnesium acetate, 1 mM DTT,
and 7.5 U APE 1 (NEB). The reaction was carried out at 37 °C for 1 h. The linearized
single-strand DNA was separated on a Novex 10% polyacrylamide TBE-urea gel
(Invitrogen) as described earlier. The expected 100-nt product bands were excised
and recovered as described earlier.

Deep Sequencing. Single-stranded template was amplified by PCR by using the Phusion
High-Fidelity enzyme (NEB) according to the manufacturer's instructions. The
oligonucleotide primers qNTI200 (5'-CAAGCAGAAGACGGCATA-3') and qNTI201
(5'-AATGATACGGCGACCACCGACAGGTTCAGAGTTCTACAGTCCGACG-3') were used to create DNA
suitable for sequencing, i.e., DNA with Illumina cluster generation sequences on
each end and a sequencing primer binding site. The PCR mixture contained 1× HF
buffer, 0.2 mM dNTP, 0.5 µM oligonucleotide primers, and 0.5 U Phusion polymerase.
PCR was carried out with an initial 30 s denaturation at 98 °C, followed by 12
cycles of 10 s denaturation at 98 °C, 20 s annealing at 60 °C, and 10 s extension at
72 °C. PCR products were separated on a nondenaturing 8% polyacrylamide TBE gel as
described earlier. The expected DNA at 120 bp was excised and recovered as described
earlier. After quantification by Agilent BioAnalyzer DNA 1000 assay, equal amounts
of barcoded samples were pooled into one sample. Approximately 3-5 pM mixed DNA
samples were used for cluster generation, followed by sequencing by using the
sequencing primer 5'-CGACAGGTTCAGAGTTCTACAGTCCGACGATC-3' (Illumina Genome Analyzer 2
or HiSeq).

Data Analysis. The deep sequencing data of ribosome footprints were processed and
analyzed by using a collection of custom Perl scripts. First, the barcoded multiplex
sequencing output files were separated into individual sample datasets according to
the first 2-nt barcodes. Second, the 3' poly(A) tails, allowing one mismatch, were
identified and removed. After that, high-quality reads of 25 to 35 nt in length were
retained, whereas other reads were excluded from the downstream analysis. The
sequences of the longest transcript isoform for each human gene were downloaded from
the Ensembl database to construct a human transcriptome reference. In addition, the
sequence of HA (NC_002017) from the National Center for Biotechnology Information
database and the plasmid sequence of Flag-FRB-GFP were used as references. The
trimmed reads were aligned to the corresponding reference transcripts by SOAP 2.0,
allowing as many as two mismatches, and only unique mapping hits were retained.
Last, the 5' end positions of aligned reads were mapped into the coding frame, and
the number of reads was counted at each codon, from codon -20 in the 5' UTR to the
stop codon, for the downstream analysis.
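
The read-processing steps above can be sketched as follows. This is a minimal Python illustration and not the custom Perl scripts used in the study: the barcode table, the sample names, the minimum tail length (`min_tail`), and all function names are assumptions introduced for the example, and the alignment step (SOAP 2.0) is not reproduced.

```python
from collections import Counter

# Placeholder 2-nt barcodes and sample names; substitute those used in the run.
BARCODES = {"CA": "sample_1", "GT": "sample_2", "AG": "sample_3", "TC": "sample_4"}


def demultiplex(read, barcodes=BARCODES):
    """Return (sample, remaining sequence) from the first 2 nt, or None if unknown."""
    sample = barcodes.get(read[:2])
    return (sample, read[2:]) if sample else None


def trim_poly_a(seq, min_tail=6, max_mismatch=1):
    """Remove a 3' poly(A) tail, tolerating up to max_mismatch non-A bases in it.

    min_tail is an assumed minimum tail length; reads without a tail are unchanged.
    """
    for i in range(len(seq) - min_tail + 1):
        tail = seq[i:]
        if tail[0] == "A" and len(tail) - tail.count("A") <= max_mismatch:
            return seq[:i]
    return seq


def keep_read(seq, min_len=25, max_len=35):
    """Length filter applied after tail removal (25 to 35 nt)."""
    return min_len <= len(seq) <= max_len


def codon_index(five_prime_pos, cds_start):
    """Codon containing the read's 5' end; negative values lie in the 5' UTR."""
    return (five_prime_pos - cds_start) // 3


def count_codons(five_prime_positions, cds_start, n_codons, upstream=20):
    """Count reads per codon from codon -upstream through the last (stop) codon."""
    counts = Counter()
    for pos in five_prime_positions:
        codon = codon_index(pos, cds_start)
        if -upstream <= codon < n_codons:
            counts[codon] += 1
    return counts
```

Positions here are 0-based transcript coordinates, and `n_codons` is taken to include the stop codon. Because the first qualifying tail start is used, a footprint that itself ends in adenosines will lose them along with the tail, which is an inherent ambiguity of poly(A)-based library construction.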

The reproducibility of the RPF distribution on individual transcripts was evaluated
by Pearson correlation. The replicates were clustered by Cluster 3.0, and heat maps
were produced by Treeview. To compare the RPF distribution on a transcript before
and after affinity purification, the reads in the first 30-codon window were
considered background because the polypeptides are still buried within the ribosome
exit tunnel and cannot be accessed by binding partners. The significance (i.e.,
P value) of the RPF density vs. background within a 10-codon sliding window in the
pull-down sample was calculated by Fisher exact test across the transcripts compared
with the total sample. The first position at which the P value was less than 0.001
was considered the folding start point. Based on the number of reads after the
folding start point, the total and pull-down data were normalized to the same scale.
To decrease the counting error, only the positions with reads above the mean read
density in the total sample were treated as comparable sites. Subsequently, the
single-codon peak ratio was calculated by dividing the normalized reads of the
pull-down sample by those of the total sample at the same codon. The trend line of
the single-codon peak ratio was determined by locally estimated scatterplot
smoothing (LOESS) by using SigmaPlot 11.0 (Systat).
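
The sliding-window test and the peak ratio described above can be sketched in Python as follows. This is one possible reading of the procedure and not the study's own code: it assumes a one-sided (enrichment) Fisher exact test on a single transcript, reports the first codon of the first significant window as the folding start point, and computes the mean read density over the whole transcript. All function names are hypothetical, and the LOESS smoothing step is not reproduced.

```python
from math import comb


def fisher_greater(a, b, c, d):
    """One-sided Fisher exact P value for enrichment of a in [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    # Hypergeometric upper tail: tables at least as extreme as the observed one.
    tail = sum(comb(col1, x) * comb(n - col1, row1 - x) for x in range(a, upper + 1))
    return tail / comb(n, row1)


def folding_start(pull, total, background=30, window=10, alpha=0.001):
    """First window start where pull-down reads are enriched over background.

    pull and total are per-codon read counts for one transcript; the first
    `background` codons serve as the background in both samples.
    """
    bg_pull, bg_total = sum(pull[:background]), sum(total[:background])
    for start in range(background, len(pull) - window + 1):
        win_pull = sum(pull[start:start + window])
        win_total = sum(total[start:start + window])
        if fisher_greater(win_pull, bg_pull, win_total, bg_total) < alpha:
            return start
    return None


def peak_ratios(pull, total, start):
    """Single-codon ratio of normalized pull-down reads to total reads."""
    # Scale pull-down reads so both samples have equal read sums after `start`.
    scale = sum(total[start:]) / sum(pull[start:])
    mean_total = sum(total) / len(total)
    # Only codons above the mean density of the total sample are compared.
    return {i: pull[i] * scale / total[i]
            for i in range(start, len(total)) if total[i] > mean_total}
```

For example, with `pull = [1] * 50 + [50] * 30` and `total = [10] * 80`, the first significant window is the first one that overlaps the enriched region. `peak_ratios` assumes at least one pull-down read after the start point, and a caller would need to guard against an empty pull-down sample.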